Continuous Speech Recognition


Continuous Speech Recognition by Linked Predictive Neural Networks

Neural Information Processing Systems

We present a large vocabulary, continuous speech recognition system based on Linked Predictive Neural Networks (LPNNs). The system uses neural networks as predictors of speech frames, yielding distortion measures which are used by the One Stage DTW algorithm to perform continuous speech recognition. The system, already deployed in a Speech to Speech Translation system, currently achieves 95%, 58%, and 39% word accuracy on tasks with perplexity 5, 111, and 402 respectively, outperforming several simple HMMs that we tested. We also found that the accuracy and speed of the LPNN can be slightly improved by the judicious use of hidden control inputs. We conclude by discussing the strengths and weaknesses of the predictive approach.
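As a rough illustration of the predictive approach, the sketch below uses hypothetical per-state linear predictors in place of the paper's neural networks: each state's prediction error for a frame serves as the local distortion, and a simplified DTW recursion (not the full One Stage algorithm) accumulates the best alignment cost. All data and weights here are made up.

```python
import numpy as np

def dtw_cost(distortion):
    """Minimal DTW over a frame-by-state distortion matrix.
    Allowed moves: stay in the same state or advance one state
    (a simplification of the One Stage DTW transitions)."""
    T, S = distortion.shape
    D = np.full((T, S), np.inf)
    D[0, 0] = distortion[0, 0]
    for t in range(1, T):
        for s in range(S):
            best_prev = D[t - 1, s]
            if s > 0:
                best_prev = min(best_prev, D[t - 1, s - 1])
            D[t, s] = distortion[t, s] + best_prev
    return D[-1, -1]

# Hypothetical "predictors": one linear map per state, standing in for
# the neural networks; distortion is the squared prediction error of
# each frame given the previous frame.
rng = np.random.default_rng(0)
frames = rng.normal(size=(20, 8))                  # 20 frames, 8-dim features
predictors = [rng.normal(size=(8, 8)) * 0.1 for _ in range(5)]
distortion = np.array([[np.sum((frames[t] - W @ frames[t - 1]) ** 2)
                        for W in predictors]
                       for t in range(1, len(frames))])
print(dtw_cost(distortion))
```

The predictor producing the smallest accumulated error along the alignment path effectively "explains" that stretch of speech, which is how prediction error doubles as a recognition score.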


Multi-State Time Delay Networks for Continuous Speech Recognition

Neural Information Processing Systems

We present the "Multi-State Time Delay Neural Network" (MS-TDNN) as an extension of the TDNN to robust word recognition. The resulting system has the ability to manage the sequential order of subword units. In this paper we present extensive new evaluations of this approach on speaker-dependent and speaker-independent connected-alphabet tasks.


Segmental Neural Net Optimization for Continuous Speech Recognition

Neural Information Processing Systems

Previously, we had developed the concept of a Segmental Neural Net (SNN) for phonetic modeling in continuous speech recognition (CSR). This kind of neural network technology advanced the state of the art of large-vocabulary CSR, which employs Hidden Markov Models (HMMs), on the ARPA 1,000-word Resource Management corpus. More recently, we started porting the neural net system to a larger, more challenging corpus: the ARPA 20,000-word Wall Street Journal (WSJ) corpus. During the porting, we explored the following research directions to refine the system: i) training context-dependent models with a regularization method; ii) training the SNN with projection pursuit; and iii) combining different models into a hybrid system. When tested on both a development set and an independent test set, the resulting neural net system alone yielded performance at the level of the HMM system, and the hybrid SNN/HMM system achieved a consistent 10-15% word error reduction over the HMM system.


Hierarchical Mixtures of Experts Methodology Applied to Continuous Speech Recognition

Neural Information Processing Systems

In this paper, we incorporate the Hierarchical Mixtures of Experts (HME) method of probability estimation, developed by Jordan [1], into an HMM-based continuous speech recognition system. The resulting system can be thought of as a continuous-density HMM system, but instead of using Gaussian mixtures, the HME system employs a large set of hierarchically organized but relatively small neural networks to perform the probability density estimation. The hierarchical structure is reminiscent of a decision tree except for two important differences: each "expert" or neural net performs a "soft" decision rather than a hard decision, and, unlike ordinary decision trees, the parameters of all the neural nets in the HME are automatically trainable using the EM algorithm. We report results on the ARPA 5,000-word and 40,000-word Wall Street Journal corpus using HME models.
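A minimal sketch of the "soft decision" idea, under stated assumptions: softmax gates split responsibility at each level of a two-level tree, and each leaf "expert" is here a unit-variance Gaussian standing in for the small neural networks of the paper. The gating weights and means are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max()           # numerically stable softmax
    e = np.exp(z)
    return e / e.sum()

def hme_density(x, top_w, sub_w, expert_means):
    """Two-level HME density estimate p(x). Each gate makes a 'soft'
    decision (a probability over branches), so every leaf contributes
    to the density, weighted by the product of gate probabilities.
    top_w: (branches, dim); sub_w: (branches, leaves, dim);
    expert_means: (branches, leaves, dim)."""
    g_top = softmax(top_w @ x)
    p = 0.0
    for i in range(len(top_w)):
        g_sub = softmax(sub_w[i] @ x)
        for j in range(sub_w.shape[1]):
            diff = x - expert_means[i, j]
            leaf = np.exp(-0.5 * diff @ diff) / (2 * np.pi) ** (len(x) / 2)
            p += g_top[i] * g_sub[j] * leaf
    return p
```

Because the gate outputs are probabilities rather than hard branch choices, the whole expression is differentiable and the mixture form makes EM updates natural, which is the trainability advantage the abstract highlights over ordinary decision trees.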


Minimum Bayes Error Feature Selection for Continuous Speech Recognition

Neural Information Processing Systems

We consider the problem of designing a linear transformation θ ∈ ℝ^{p×n}, of rank p ≤ n, which projects the features of a classifier x ∈ ℝ^n onto y = θx ∈ ℝ^p so as to achieve minimum Bayes error (or probability of misclassification). Two avenues will be explored: the first is to maximize the θ-average divergence between the class densities and the second is to minimize the union Bhattacharyya bound in the range of θ. While both approaches yield similar performance in practice, they outperform standard LDA features and show a 10% relative improvement in the word error rate over state-of-the-art cepstral features on a large vocabulary telephony speech recognition task.
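To make the projection design concrete, the sketch below evaluates the Bhattacharyya bound for a hypothetical pair of 3-D Gaussian class densities under two candidate rank-1 projections θ: the direction aligned with the class-mean difference preserves the separation, while an orthogonal one destroys it. The densities are invented for illustration, not taken from the paper.

```python
import numpy as np

def bhattacharyya(mu1, S1, mu2, S2):
    """Bhattacharyya distance between two Gaussians; for equal priors
    the Bayes error is bounded above by 0.5 * exp(-B)."""
    S = 0.5 * (S1 + S2)
    dm = mu1 - mu2
    term1 = 0.125 * dm @ np.linalg.solve(S, dm)
    term2 = 0.5 * np.log(np.linalg.det(S) /
                         np.sqrt(np.linalg.det(S1) * np.linalg.det(S2)))
    return term1 + term2

# Hypothetical 3-D class densities, projected by theta in R^{1x3}.
mu1, mu2 = np.array([0., 0., 0.]), np.array([2., 0., 0.])
S = np.eye(3)

def projected_B(theta):
    th = np.atleast_2d(theta)
    return bhattacharyya(th @ mu1, th @ S @ th.T, th @ mu2, th @ S @ th.T)

good = projected_B(np.array([1., 0., 0.]))  # along the mean difference
bad = projected_B(np.array([0., 1., 0.]))   # orthogonal to it
print(good, bad)
```

A larger Bhattacharyya distance means a smaller error bound, so minimizing the union bound over θ, as in the abstract's second avenue, amounts to searching for projections like `good` rather than `bad`.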


Fusayasu

AAAI Conferences

In spite of the recent advancements in speech recognition, recognition errors are unavoidable in continuous speech recognition. In this paper, we focus on a word-error correction system for continuous speech recognition using confusion networks. Conventional N-gram correction is widely used; however, its performance degrades because the N-gram approach cannot capture relationships between distant words. To improve on the N-gram model, we employ Normalized Relevance Distance (NRD) as a measure of semantic similarity between words. NRD captures not only co-occurrence but also the correlation of the terms' importance across documents, so it can estimate semantic similarity even between words located far from each other. The effectiveness of our method was evaluated on continuous speech recognition tasks with multiple test speakers. Experimental results show that our error-correction method outperforms methods using other features.
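For illustration only, a distance of this family can be sketched in its NGD-style form on raw document counts; NRD as used in the paper replaces raw counts with relevance-weighted statistics, so the function and numbers below are an assumption-laden stand-in, not the paper's exact formula.

```python
import math

def normalized_distance(fx, fy, fxy, N):
    """NGD-style normalized distance on document counts.
    fx, fy: number of documents containing each word;
    fxy: number of documents containing both; N: corpus size.
    Smaller values mean the words are semantically closer."""
    lx, ly, lxy, lN = (math.log(v) for v in (fx, fy, fxy, N))
    return (max(lx, ly) - lxy) / (lN - min(lx, ly))

# Words that co-occur in most of their documents score near 0;
# words that rarely co-occur score much higher.
near = normalized_distance(1000, 800, 700, 10**6)
far = normalized_distance(1000, 800, 5, 10**6)
print(near, far)
```

Because the statistics come from whole documents rather than a fixed N-gram window, such a measure can relate words regardless of how far apart they appear in the recognized sentence, which is the property the abstract exploits.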


Continuous Silent Speech Recognition using EEG

Krishna, Gautam, Tran, Co, Carnahan, Mason, Tewfik, Ahmed

arXiv.org Machine Learning

In this paper we explore continuous silent speech recognition using electroencephalography (EEG) signals. We implemented a connectionist temporal classification (CTC) automatic speech recognition (ASR) model to translate to text EEG signals recorded while subjects read English sentences in their mind without producing any voice. Our results demonstrate the feasibility of using EEG signals for continuous silent speech recognition. We report results for a limited English vocabulary consisting of 30 unique sentences.
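The CTC output convention the model relies on can be illustrated with a toy greedy (best-path) decoder on synthetic frame posteriors; in the paper these would come from the EEG model, and the 3-symbol alphabet here is hypothetical.

```python
import numpy as np

BLANK = 0

def ctc_greedy_decode(logits, id2char):
    """Greedy (best-path) CTC decoding: take the argmax label per
    frame, collapse consecutive repeats, then drop blanks."""
    path = logits.argmax(axis=1)
    out, prev = [], BLANK
    for p in path:
        if p != prev and p != BLANK:
            out.append(id2char[p])
        prev = p
    return "".join(out)

# Hypothetical 3-class posteriors (blank, 'h', 'i') over 6 frames.
id2char = {1: "h", 2: "i"}
logits = np.array([[0.1, 0.8, 0.1],    # h
                   [0.1, 0.8, 0.1],    # h (repeat, collapsed)
                   [0.9, 0.05, 0.05],  # blank
                   [0.1, 0.1, 0.8],    # i
                   [0.9, 0.05, 0.05],  # blank
                   [0.9, 0.05, 0.05]]) # blank
print(ctc_greedy_decode(logits, id2char))  # -> "hi"
```

The blank symbol and repeat-collapsing are what let a CTC model emit a short sentence from many EEG frames without needing frame-level alignments at training time.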


EEG based Continuous Speech Recognition using Transformers

Krishna, Gautam, Tran, Co, Carnahan, Mason, Tewfik, Ahmed H

arXiv.org Machine Learning

In this paper we investigate continuous speech recognition from electroencephalography (EEG) features using the recently introduced end-to-end transformer-based automatic speech recognition (ASR) model. Our results show that the transformer-based model trains and infers faster than recurrent neural network (RNN) based sequence-to-sequence EEG models, but the RNN-based models performed better at test time on a limited English vocabulary. Continuous speech recognition using non-invasive brain signals, or electroencephalography (EEG) signals, is an emerging area of research in which EEG signals recorded from the scalp of the subject are translated to text. EEG-based continuous speech recognition technology gives people with speaking disabilities, or people who are unable to speak, better access to technology. Current state-of-the-art voice assistant systems mainly process acoustic input features, limiting technology accessibility for people with speaking disabilities or no ability to produce voice.


Continuous Speech Recognition using EEG and Video

Krishna, Gautam, Carnahan, Mason, Tran, Co, Tewfik, Ahmed H

arXiv.org Machine Learning

In this paper we investigate whether electroencephalography (EEG) features can be used to improve the performance of continuous visual speech recognition systems. We implemented a connectionist temporal classification (CTC) based end-to-end automatic speech recognition (ASR) model for performing recognition. Our results demonstrate that EEG features help enhance the performance of continuous visual speech recognition systems. In recent years there has been a lot of interesting work in the fields of lip reading and audio-visual speech recognition: in [1] the authors demonstrated end-to-end sentence-level lip reading, and in [2] the authors demonstrated deep learning based end-to-end audio-visual speech recognition.


Artificial Intelligence Programming in Java

#artificialintelligence

Several programming languages are available for developing an artificial intelligence project, such as Python, POP-11, C, MATLAB, Java, Lisp, and the Wolfram Language. In this article, you will find out how Java programming works with artificial intelligence. The main feature of Java is the Java Virtual Machine (JVM), an abstract machine available on many hardware and software platforms. The JVM loads code, verifies code, provides a runtime environment, and executes code.